Key Details:
- Direct Hire/Perm
- Location: San Diego, CA - Hybrid
- Pay: $150-180k + 10% bonus
- Must be eligible to work in the US without sponsorship now or in the future.
Summary:
We are seeking a technically fluent, AI-forward data engineer and solutions architect to build and own the next-generation data and AI platform that powers how our organization accesses, analyzes, and acts on commercial real estate data. This role is the builder behind our "talk to your data" initiative.
You will architect and implement Microsoft Fabric/OneLake as our centralized data platform, build governed ingestion pipelines from Yardi, SharePoint, CoStar, GreenStreet, Argus, and other CRE systems, and develop AI agents and natural language interfaces that enable business users to query the portfolio without writing a single line of code or opening a dashboard. You will work directly with IT, FP&A, and other stakeholders to define data requirements and ensure the platform delivers reliable, governed, AI-ready data.
The ideal candidate is a builder, comfortable in Python, SQL, and vector databases, experienced with enterprise cloud data platforms, and an evangelist for the intersection of AI and structured business data. CRE knowledge is a plus, not a prerequisite; the ability to learn a domain quickly and translate it into data architecture is required.
Primary Responsibilities:
- Architect and implement Microsoft Fabric/OneLake as our centralized, cloud-native data platform.
- Build governed ingestion pipelines from Yardi and DataFreedom, SharePoint, Argus, CoStar, GreenStreet, Measurabl, and other CRE data sources.
- Design and deploy AI agents and natural language interfaces using the Microsoft Copilot and Azure AI stack governed within our Microsoft 365 tenant.
- Implement schema validation, business rule enforcement, and data quality monitoring at the ingestion layer.
- Serve as the platform engineering partner to analysts; define and maintain the data foundation that BI reporting consumes.
- Champion AI and emerging technology adoption within enterprise-governed, security-first boundaries.
- Maintain access controls, audit trails, and data governance standards across all platform components.
- Train end-users on AI tools and natural language interfaces.
Primary Duties:
(The list below is not comprehensive, and priorities will change on a day-to-day basis.)
Data Infrastructure & Integration (35-45%)
- Architect and implement Microsoft Fabric/OneLake as our centralized, cloud-native data platform, including Lakehouses, Delta tables, Dataflow Gen2 pipelines, and DirectLake semantic model configuration.
- Build governed ingestion pipelines from Yardi ERP (via DataFreedom SQL abstraction layer and/or MCP), SharePoint (Excel templates, PDFs), Argus, CoStar, GreenStreet, Measurabl, and other CRE data sources.
- Implement schema validation, business rule enforcement, and data quality monitoring at the ingestion layer, ensuring that only validated, approved data is promoted to the authoritative OneLake layer.
- Design data models that normalize and contextualize information across disparate systems, aligned with the DataFreedom property master as the referential integrity anchor.
- Establish and maintain data governance standards, documentation, data lineage tracking, and change management processes.
- Migrate the Power BI semantic model from the on-premises gateway (WDSSRV) to cloud-native Fabric refresh, eliminating a single point of failure.
AI Agent Development & Automation (35-45%)
- Design and deploy AI agents and natural language interfaces using the Microsoft Copilot and Azure OpenAI stack (Azure AI Search, Azure OpenAI Service, Copilot Studio, Power Automate), governed within our Microsoft 365 tenant with no external data transmission.
- Build and maintain the Ask AI agents (Teams + Outlook integration), connecting them to the Azure AI Search vector index over OneLake Delta tables and SharePoint metadata-tagged documents.
- Build API integrations and custom connectors enabling AI systems to query structured (OneLake) and unstructured (SharePoint metadata-tagged documents) data sources.
- Develop automated workflows for recurring report distribution, data collection, validation notifications, and business processes using Power Automate and Python.
- Evaluate, pilot, and operationalize emerging AI capabilities within the Microsoft 365 and Azure ecosystem, maintaining an enterprise-governed, security-first approach.
Platform Security & Governance (Ongoing)
- Implement and maintain access controls for OneLake workspaces and Azure AI Search indexes, inheriting Entra ID security groups and Conditional Access policies.
- Ensure all AI-generated outputs are scoped by user authorization, with security trimming enforced at the query layer, not just the UI.
- Maintain audit trail integrity via Delta table versioning in OneLake.
- Work with IT leadership to ensure all data pipelines, AI agents, and integrations comply with our data governance, privacy, and security standards.
Business Intelligence Support (10-20%)
- Serve as the data platform partner to the Senior BI Analyst: define upstream data requirements, validate semantic model inputs, and resolve data quality issues at the source.
- Maintain and enhance existing Power BI dashboards as needed during the platform transition.
- Support Copilot for Power BI enablement on the Fabric-hosted semantic model.
- Train end-users on AI tools, natural language interfaces, and new platform capabilities.
Desired Experience:
- Bachelor's degree in Computer Science, Information Systems, Data Engineering, Data Analytics, or a related field.
- 3-6 years of professional experience in data engineering, analytics engineering, or a BI/data platform role with a strong engineering component.
- Demonstrated experience building production data pipelines on cloud-native platforms (Microsoft Fabric, Azure Data Lake, Databricks, or equivalent).
- Hands-on experience building AI-powered applications or agents with LLMs that goes beyond basic prompting, such as RAG architecture, vector search, API integration, or Copilot Studio/custom agent development.
- Commercial real estate, real estate finance, or related industry experience preferred but not required.
- Strong understanding of data architecture, data modeling, and ETL/ELT processes.
- Experience building API integrations and automated workflows.
Desired Technical Skills:
- Microsoft Fabric / OneLake: Lakehouse architecture, Dataflow Gen2, Fabric Pipelines, Delta tables, DirectLake semantic model configuration.
- SQL: Advanced query writing, data modeling, understanding of relational and columnar database structures.
- Python: Data engineering and automation scripting (pandas, requests, pyarrow, API integration); ability to write clean, documented, maintainable code.
- AI Platforms: Hands-on experience with Azure OpenAI Service, Azure AI Search (vector + hybrid search), Copilot Studio, or equivalent enterprise-governed LLM platforms.
- Power BI: Dashboard development, DAX, Power Query, semantic model administration.
- Power Automate: Complex workflow automation and system integration.
- APIs: REST API design and consumption; ability to build and maintain custom connectors.
- Version Control: Git/GitHub.
Strongly Preferred:
- Experience with SharePoint Premium (Syntex / Content AI): document classification, metadata extraction, content processing.
- Familiarity with CRE data systems: Yardi, DataFreedom, CoStar, Argus, GreenStreet, or Measurabl.
- Understanding of enterprise data governance frameworks, data lineage, and stewardship models.
- Experience with Microsoft Entra ID, security group-based access control, and data security trimming in AI search contexts.
- Microsoft Fabric DP-600 certification (or willingness to obtain).
Core Competencies:
- Builder mindset: move from whiteboard to working system without perfect requirements.
- Security and governance forward: design access controls and audit trails in from the start.
- Domain learner: curiosity and the ability to learn CRE business context and translate it into data architecture are required.
- Cross-functional collaborator: work closely with multiple stakeholders and translate technical constraints into plain language.
- Disciplined and documented: pipelines, agents, and integrations must be maintainable by others; documentation is part of the deliverable.
All qualified applicants will receive consideration for employment without regard to race, color, national origin, age, ancestry, religion, sex, sexual orientation, gender identity, gender expression, marital status, disability, medical condition, genetic information, pregnancy, or military or veteran status. We consider all qualified applicants, including those with criminal histories, in a manner consistent with state and local laws, including the California Fair Chance Act, City of Los Angeles' Fair Chance Initiative for Hiring Ordinance, and Los Angeles County Fair Chance Ordinance.